Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for rigid type variables #560

Merged
merged 8 commits into from
Apr 18, 2024

Conversation

xxdavid
Copy link
Collaborator

@xxdavid xxdavid commented Jan 22, 2024

Hi!

I had almost no time to work on the local type inference (#553) in the past months, but I finally found some time last week. After thinking about the implementation, I've decided that a good first step towards LTI would be to add the missing support for rigid type variables. This way we can check bodies of polymorphic functions. And here's a PR just for that.

Rigid type variables are usually treated as type you know nothing about. You cannot assume anything about them and almost no subtyping rules apply to them (except for reflexivity, top, bottom, and any()). When typechecking a function, all of the type variables mentioned in its spec should be considered as rigid, because they can be anything (it is up to the caller what they are instantiated to (in rank-1 polymorphism)).

With this, we can discover type errors like this one:

-spec id(A) -> A.
id(_X) -> 42.

The implementation in this PR tells us that in this case:

The integer on line 2 at column 11 is expected to have type A but it has type 42

When implementing rigid type variables, we have to somehow distinguish between flexible and rigid type variables. I came up with three possible ways of doing so:

  1. Adding a new type() shape: {rigid_type, anno(), atom()}
  2. Adding a new anno() shape: {is_rigid, boolean()}
  3. Keeping the kind (flexible/rigid) of type variables in the current scope in Env

I turned down option number two, as we could not use pattern matching or guards to distinguish between flexible and rigid type variables. Option three could come with a macro guard is_flexible(Var, Env) (which is nice) but would require us to use it in every place where type variables are used in typechecking. This seems quite fragile to me, and thus I went for the option number one. A good thing about this approach is that it required no changes to the typechecking core (subtype, type_check_expr, etc.), only a few utility functions were enriched with a new clause for rigid_var. The only hassle was that erl_pp doesn't expect rigid_var and prints INVALID-FORM in that case, so we have to transform every type before printing it, but this seems okay to me.

I added tests for rigid type variables and I think it works really well. I also moved poly_2 test functions with explanation in the commit message, feel free to react to that as well.

What do you think about it? @erszcz @zuiderkwast

Copy link
Collaborator

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks awesome!

I didn't really know how to implement and use rigid type vars but looks very strait forward.

@@ -308,6 +313,8 @@
-type gr_type_var() :: atom() | string().
-type af_type_variable() :: {'var', anno(), gr_type_var()}.

-type gr_rigid_type_variable() :: {'rigid_var', anno(), atom()}.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since most of this file is copied from OTP, we want to explain all our changes with comments. The prefix "gr" is good but maybe something more about why we want to add this type?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I added the missing comment, together with a forgotten doc for the make_rigid_type_vars/1 (with a slightly different wording): 5a18b8e. Is it better now?

@@ -338,6 +345,8 @@
[af_lit_atom('is_subtype') |
[af_type_variable() | abstract_type()]]}. % [IsSubtype, [V, T]]

-type gr_any_type() :: {'type', anno(), 'any', []}.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to add this?

I think it's covered by this definition:

-type af_predefined_type() ::
        {'type', anno(), type_name(),  [abstract_type()]}.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessarily needed. I used it here to emphasise that fenv carries either a spec or simply any() (wrapped in a list) for each function (as it wasn't very apparent to me). We could inline it there like this:

-record(env, {fenv              = #{}   :: #{{atom(), arity()} =>
                                                  [gradualizer_type:af_constrained_function_type()]
                                                | [{'type', gradualizer_type:anno(), 'any', []}]
                                            },

but I'm not sure whether it's better. What would you do?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'm fine with any. :-)

test/should_fail/opaque_fail.erl Outdated Show resolved Hide resolved
@xxdavid
Copy link
Collaborator Author

xxdavid commented Jan 23, 2024

Thank you for the review! 🙂

Until this commit, we had no notion of rigid type variables
in Gradualizer. All type variables were treated as flexible
and thus errors in polymorphic functions were not discovered.
Even obvious errors like
    -spec id(A) -> A.
    id(_X) -> banana.
were left unreported.

This commit adds a new syntactic form of `type()`:
`{rigid_var, anno(), atom()}`. All type variables mentioned
in a spec are convert to rigid_vars during typechecking
of the particular function. Rigid type variables have no
subtypes or supertypes (except for itself, top, none(),
and any()), and therefore we treat them as types we know
nothing about (as they can be anything; their type
determines the caller, not the callee).

I've also added some missing cases to `subst_ty/2`.
All these function coming from the paper "Bidirectional Typing
for Erlang" (N. Rajendrakumar, A. Bieniusa, 2021) require
higher-rank (at least rank-2) polymorphism. To be able to
have higher-ranked polymorphism, we would have to be able to
express inner quantifiers. We don't have that in the typespec
syntax, so the traditional way of interpreting type variables
in specs is to treat them all as quantified at the top-level.

Authors of the paper took a non-standard approach, to quote it:

    Type specifications in Erlang consider all type variables
    as universal and thus restrict the types to prefix or rank-1
    polymorphism.

    For allowing higher-rank polymorphism, we are interpreting
    the type specification differently, namely we are adding
    the quantifier to the type variable with the smallest scope.

    -spec poly_2(fun((A) -> A)) -> {boolean(), integer()}.

    For example, the type spec for function poly_2 above is
    interpreted as
        (∀a.(a) → a) → {boolean(), integer()}
    instead of
        ∀a.((a → a) → {boolean(), integer()})

I argue that this interpretation of type variables would
be confusing for users. Additionally, I think that there
is not much value in having higher-rank polymorphism in
a dynamically/gradually typed language. In Haskell, it
is useful for sure (even though one does not encounter
it very often) but this is because there is a strict
typechecker that doesn't accept anything you cannot type
well. Here, in Erlang, if you ever come to a situation
where you need higher-rank polymorphism (and I think it
is quite unlikely), you can always fallback to any().
@xxdavid
Copy link
Collaborator Author

xxdavid commented Apr 18, 2024

Hey guys, could we please move this forward? 🙂 Is there anything left you want to change or comment on?

I've just fixed the authorship of the commits (there was a wrong email address used in them; otherwise the commits are the same) and force-pushed them. I also added one small fixup commit (a103f65).

Copy link
Collaborator

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @xxdavid, I think me and @erszcz are too busy with other projects.

I don't have a full understanding of how rigid type variables affects other ideas such as constraint solver, but I agree you're solving important problems here.

Please add relevant documentation. We use the wiki for documentation: https://github.com/josefs/Gradualizer/wiki. I hope you have write permissions to the wiki.

@zuiderkwast zuiderkwast merged commit ac139bc into josefs:master Apr 18, 2024
5 checks passed
@xxdavid
Copy link
Collaborator Author

xxdavid commented Apr 18, 2024

@zuiderkwast I get it. Thank you for merging the PR.

Okay, I'll add a chapter related to Polymorphism to the wiki. I think it can cover both directions: checking the polymorphic functions itself (this involves the rigid type variables) and checking calls to such polymorphic functions (this will hopefully use the LTI approach I described in #553, I will make a PR for that soon). But I am afraid I don't have an edit permission to the wiki.

@zuiderkwast
Copy link
Collaborator

@josefs, do you want to give write permissions to @xxdavid in this repo?

He is currently the most active person in this project.

Alternatively, allow everyone to edit the wiki pages by unchecking "Restrict editing to collaborators only" under settings:

image

Thanks.

@josefs
Copy link
Owner

josefs commented Apr 18, 2024

@josefs, do you want to give write permissions to @xxdavid in this repo?

Done

@xxdavid
Copy link
Collaborator Author

xxdavid commented Apr 18, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants